fix: systematic ty type-checker cleanup across codebase#13481
fix: systematic ty type-checker cleanup across codebase#13481alt-glitch wants to merge 11 commits intomainfrom
Conversation
acd1f17 to
f8b55fb
Compare
kshitijk4poor
left a comment
There was a problem hiding this comment.
Re-review after rebase (v2)
Thanks for rebasing onto current main — the branch is now at 0 commits behind, merge conflicts are gone, and the commit count dropped from 14 to 10 (cleaner history). This resolves all 6 critical blockers and all 4 behavioral regressions from my first review:
✅ Resolved from v1 review
→ Fixed —agent.*removed from setuptoolsagent.*is now preservedGateway hook system gutted→ Resolved by rebase — theemit_collect()hook system remains untouched→ Resolved by rebase — no longer toucheduser_name/chat_idremoved from AIAgent→ Resolved by rebaserelease_gateway_runtime_lock()removedPlugin slash commands removed from Discord→ Resolved by rebaseTransport cache invalidation removed→ Resolved by rebaseMEDIA pattern extensions removed→ Resolved by rebaseiOS smart-dash removal→ Resolved by rebaseDiscord→ Resolved by rebase_slash_commandsconfig removedAuto-prune removed→ Resolved by rebase
The diff is now much cleaner. Here's what remains:
Remaining concerns (non-blocking but worth discussing)
1. Soft dep → hard crash policy (still present)
These changes are still in the PR:
- Pillow:
vision_tools.py,clipboard.py,tui_gateway/server.py— previously degraded gracefully (skip resize, fallback to ImageMagick or raw data URL), now crashes withImportError - tiktoken:
tools_config.py— previously returned empty dict when unavailable, now crashes - mutagen:
discord.py— previously fell through to default 5s duration, now crashes - psutil:
mcp_tool.py— previously returned empty set, now crashes - soundfile:
neutts_synth.py— previously fell back to_write_wav()manual implementation, now crashes - ptyprocess:
process_registry.py— previously fell back to pipe mode with a warning, now crashes
Declaring these in extras ([cli], [messaging], [mcp], [pty], [tts-local]) is good. But the behavioral change from graceful degradation → crash is a separate decision from type cleanup.
Specific concern: The ptyprocess one is particularly risky — the current fallback to pipe mode is intentional and useful. Users who request PTY but don't have ptyprocess installed currently get pipe mode as a reasonable fallback. The crash means PTY requests will hard-fail instead.
Specific concern: The psutil one in _snapshot_child_pids() previously had a broader except Exception fallback that returned an empty set even for non-import failures. The new code catches psutil.Error but if psutil raises something else (e.g., permission error on macOS), it would propagate uncaught. The old fallback-to-empty-set was defensive for good reason.
Specific concern: The soundfile one removes a working _write_wav() fallback — this was real functionality, not silent degradation.
2. read_credential_pool() API change
The function signature changes from (provider_id: Optional[str] = None) to (). A new read_provider_credentials(provider_id: str) is added. The two internal callers are updated correctly. However, read_credential_pool is a public function in hermes_cli.auth — any plugin or external code calling read_credential_pool("openrouter") will now silently get the entire pool dict instead of the provider's entries (the parameter is gone, not rejected). Consider keeping the old signature with a deprecation warning, or at minimum documenting this as a breaking change.
3. api_server.py — aiohttp direct import at module level
The AIOHTTP_AVAILABLE guard is removed and aiohttp is imported unconditionally at module level. The check_api_server_requirements() function is deleted. This is fine because gateway/run.py now does import aiohttp with try/except before importing APIServerAdapter, so the module won't be imported when aiohttp is missing. The guard in run.py is the correct place for it.
However, the connect() method also had a guard: if not AIOHTTP_AVAILABLE: return False — this is now gone. This is fine given the run.py pre-check, but it does mean if someone imports APIServerAdapter directly without the pre-check, they'll get a module-level ImportError rather than a graceful False from connect().
4. _message_handler None guard in base.py
handler = self._message_handler
if handler is None:
return
response = await handler(event)This is a correct ty fix but changes behavior: previously await self._message_handler(event) would raise TypeError: 'NoneType' is not callable if _message_handler was None. Now it silently returns. This is arguably better but could mask initialization bugs.
5. _handle_retry_command return type widened to Optional[str]
- async def _handle_retry_command(self, event: MessageEvent) -> str:
+ async def _handle_retry_command(self, event: MessageEvent) -> Optional[str]:Need to verify callers handle None returns. The gateway dispatch generally does handle None (treats it as "no response"), so this is likely safe.
6. Discord hasattr → isinstance changes in gateway/run.py
Multiple hasattr(adapter, "method") checks are replaced with isinstance(adapter, DiscordAdapter). This is correct per our Pitfall #14 (MagicMock fools hasattr), but it adds from gateway.platforms.discord import DiscordAdapter imports inside each method. The imports are function-level (lazy), so no circular import risk. This is a good change.
7. Config TypedDicts (+361 lines in config.py)
Still present. These add maintenance burden — every new config key needs a TypedDict update. The total=True default means missing keys are type errors. Given how fast config evolves, this will require constant upkeep. Not blocking, but worth considering if the team is willing to maintain these.
8. Script relocation (batch_runner.py → scripts/)
Still present. The py-modules in pyproject.toml is updated (removes batch_runner, trajectory_compressor, rl_cli), scripts is added to packages.find.include, and scripts/__init__.py is created. Logger name changes in cli.py and run_agent.py are correctly updated. Test references are updated. The datagen-config-examples/ scripts are updated. This looks complete.
9. Minor nit: ty: ignore comments
The PR adds # ty: ignore[invalid-return-type] and # ty: ignore[invalid-key] comments in discord.py. These are for a non-standard type checker. If the project doesn't use ty in CI, these are just noise. But they don't hurt anything.
✅ Good changes that look correct
All the items from my v1 "salvageable" list are still present and correct:
callable→Callable[..., Any]across run_agent.py, cli.py, delegate_tool.py, etc.os.sys.modules→sys.modulestypo fix- Discord null-safety guards (interaction.data, thread_id, member_count)
- Telegram sticker None guard
- Matrix walrus-narrowing fix
build_anthropic_client(timeout: Optional[float])wideningOMIT_TEMPERATUREsentinel guard → kwargs-dict patternCredentialPool.remove_entry()method_setup_wecom_callback()addition_CamofoxConfigTypedDictdatetime.utcnow()→datetime.now(timezone.utc)- email.py
send_image/send_documentsignature fixes display.pystart_time assertcron/scheduler.pyNone-safe error extractionskill_utils.pytuple unpacking fiximage_generation_tool.pyfal_client type narrowing (properly typed getattr results)process_registry.pypty import restructuring (though the no-fallback concern above)conversation_history: list | Nonewideningbase.pyunreachable assertions after retry loops- Various return type widenings (get_toolset_info, _handle_retry_command, _prepare_command, etc.)
- Tests for remove_entry, config shapes, RunState fields
Summary
The rebase resolved all the critical and behavioral issues. What remains is:
- The soft→hard crash policy — the biggest philosophical question. Worth a team discussion on whether graceful degradation or fail-loud is preferred for optional deps.
- Minor API breakage —
read_credential_pool()signature change - Maintenance burden — config TypedDicts
The type fixes, null-safety guards, and structural additions (remove_entry, _setup_wecom_callback, RunState fields) are all valuable. If the team is comfortable with the soft→hard crash policy, this is much closer to mergeable. If not, those specific changes could be reverted while keeping everything else.
|
Thanks for the thorough re-review. Addressing the remaining items: 1. Soft dep → hard crash policyThis is intentional. Every dependency referenced in these
The previous "graceful degradation" paths were masking misconfigured installs — silently bypassing SOCKS proxies, skipping image resizing, guessing voice message durations, falling back to pipe mode, etc. If a user installs The error messages all point to the correct extras group ( One propagation fix in 2.
|
Widen return type annotations to match actual control flow, add unreachable assertions after retry loops ty cannot prove terminate, split ambiguous union returns (auth.py credential pool), and remove the AIOHTTP_AVAILABLE conditional-import guard from api_server.py.
Move batch_runner, trajectory_compressor, mini_swe_runner, and rl_cli from the project root into scripts/, update all imports, logger names, pyproject.toml, and downstream test references.
Replace hasattr() duck-typing with isinstance() checks for DiscordAdapter in gateway/run.py, add TypedDict for IMAGEGEN_BACKENDS in tools_config.py, properly type fal_client getattr'd callables in image_generation_tool.py, fix dict[str, object] → Callable annotation in approval.py, use isinstance(BaseModel) in web_tools.py, capture _message_handler to local in base.py, rename shadowed list_distributions parameter in batch_runner.py, and remove dead queue_message branch.
…guards Previously mutagen, aiohttp-socks, tiktoken, Pillow, psutil, datasets, neutts, and soundfile were used behind try/except ImportError with silent fallbacks, masking broken functionality at runtime. Declare each in its natural extra (messaging, cli, mcp, rl, new tts-local) so they get installed, and remove the guards so missing deps crash loudly.
Add TypedDicts for DEFAULT_CONFIG, CLI state dicts (_ModelPickerState, _ApprovalState, _ClarifyState), and OPTIONAL_ENV_VARS so ty can resolve nested dict subscripts. Guard Optional returns before subscripting (toolsets, cron/scheduler, delegate_tool), coerce str|None to str before slicing (gateway/run, run_agent), split ternary for isinstance narrowing (wecom), and suppress discord interaction.data access with ty: ignore.
15 P1 ship-stopper runtime bugs from the ty triage plus the cross-bucket cleanup in run_agent.py. Net: -138 ty diagnostics (1953 -> 1815). Major wins on not-subscriptable (-34), unresolved-attribute (-29), invalid-argument-type (-26), invalid-type-form (-20), unsupported-operator (-18), invalid-key (-9). Missing refs (structural): - tools/rl_training_tool.py: RunState dataclass gains api_log_file, trainer_log_file, env_log_file fields; stop-run was closing undeclared handles. - agent/credential_pool.py: remove_entry(entry_id) added, symmetric with add_entry; used by hermes_cli/web_server.py OAuth dashboard cleanup. - hermes_cli/config.py: _CamofoxConfig TypedDict defined (was referenced by _BrowserConfig but never declared). - hermes_cli/gateway.py: _setup_wecom_callback() added, mirroring _setup_wecom(). - tui_gateway/server.py: skills_hub imports corrected from hermes_cli.skills_hub -> tools.skills_hub. Typo / deprecation: - tools/transcription_tools.py: os.sys.modules -> sys.modules. - gateway/platforms/bluebubbles.py: datetime.utcnow() -> datetime.now(timezone.utc). None-guards: - gateway/platforms/telegram.py:~2798 - msg.sticker None guard. - gateway/platforms/discord.py:3602/3637 - interaction.data None + SelectMenu narrowing; :3009 - thread_id None before `in`; :1893 - guild.member_count None. - gateway/platforms/matrix.py:2174/2185 - walrus-narrow re.search().group(). - agent/display.py:732 - start_time None before elapsed subtraction. - gateway/run.py:10334 - assert _agent_timeout is not None before `// 60`. Platform override signature match: - gateway/platforms/email.py: send_image accepts metadata kwarg; send_document accepts **kwargs (matches base class). run_agent.py annotation pass: - callable/any -> Callable/Any in annotation position (15 sites in run_agent.py + 5 in cli.py, toolset_distributions.py, tools/delegate_tool.py, hermes_cli/dingtalk_auth.py, tui_gateway/server.py). - conversation_history param widened to list[dict[str, Any]] | None. - OMIT_TEMPERATURE sentinel guarded from leaking into call_llm(temperature): kwargs-dict pattern at run_agent.py:7337 + scripts/trajectory_compressor.py:618/688. - build_anthropic_client(timeout) widened to Optional[float]. Tests: - tests/agent/test_credential_pool.py: remove_entry (id match, unknown-id, priority renumbering). - tests/hermes_cli/test_config_shapes.py: _CamofoxConfig shape + nesting. - tests/tools/test_rl_training_tool.py: RunState log_file fields.
Follow-up to 15ac253 per /simplify review: - gateway/platforms/discord.py:3638 - move self.resolved = True *after* the `if interaction.data is None: return` guard. Previously the view was marked resolved before the None-guard, so a None data payload silently rejected the user's next click. - agent/display.py:732 - replace `if self.start_time is None: continue` with `assert self.start_time is not None`. start() sets start_time before the animate thread starts, so the None branch was dead; the `continue` form would have busy-looped (skipping the 0.12s sleep). - tests/hermes_cli/test_config_shapes.py - drop __total__ dunder restatement test (it just echoes the class declaration); trim commit narration from module docstring. - tests/agent/test_credential_pool.py, tests/tools/test_rl_training_tool.py - drop "added in commit ..." banners (narrates the change per CLAUDE.md).
When optional dependencies are missing, raise ImportError with installation instructions pointing to the relevant extras group (e.g. `[messaging]`, `[cli]`, `[mcp]`, etc.) instead of letting the import fail silently.
Remove the unnecessary nudge about agent refactoring; the TODO describes the actual work that needs to be done.
… clipboard Remove 44 TypedDict classes from config.py — they were already stale (11 missing keys) and load_config() still returns Dict[str, Any], so they provided zero type-checking value. Keep the int() coercions and Dict[str, Any] annotations which are real fixes. Fix _wayland_save() swallowing ImportError at DEBUG level by adding an explicit except ImportError: raise before the broad except Exception.
82d6a80 to
4150433
Compare
|
Retraction — my earlier review was wrong on the headline concern. @sidbin caught it. The from agent.transports.types import NormalizedResponse, ToolCall
tc = ToolCall(id="call_123", name="web_search", arguments='{"query":"foo"}')
nr = NormalizedResponse(content=None, tool_calls=[tc], finish_reason="tool_calls")
nr.tool_calls[0].function.name
# 'web_search' — works fineMy earlier repro pulled the Concerns #2 (removed silent import guards for Sorry for the noise. |
Summary
Systematic pass through the codebase fixing diagnostics reported by ty (Astral's type checker), organized by error category. Eliminates hundreds of type errors across ~82 files, cleans up redundant imports, declares missing optional dependencies, removes silent failure patterns, and — most recently — lands a focused P1 hotfix batch for actual runtime-crash bugs surfaced by triaging the remaining diagnostic buckets.
What changed
Type fixes by category
unresolved-reference— AddedTYPE_CHECKINGimports for forward references (gemini adapters, copilot client, etc.); defined missing_CamofoxConfigTypedDict referenced by_BrowserConfig.unresolved-import— Correctedskills_hubimports intui_gateway/server.py(was pointing at the wrapper module, now targets the source). Added_setup_wecom_callback()function thathermes_cli/setup.pywas importing.unresolved-attribute—RunStatedataclass gained missingapi_log_file/trainer_log_file/env_log_filefields (stop-run flow was closing undeclared handles); addedCredentialPool.remove_entry(entry_id)method (symmetric withadd_entry, called during OAuth dashboard cleanup);os.sys.modules→sys.modulestypo fix; None-guards beforemsg.sticker.*andre.search(...).group().invalid-return-type— Widened return annotations to match actual control flow, added unreachable assertions after retry loops ty can't prove terminate, splitread_credential_poolinto two functions to eliminate ambiguous union returns.call-non-callable— Fixed parameter shadowing imported callable (batch_runner.py), narrowed Optional before calling, typed callback dicts properly.not-subscriptable/invalid-key— AddedTypedDictforDEFAULT_CONFIG(and nested configs), CLI state dicts (_ModelPickerState,_ApprovalState,_ClarifyState), coercedstr | Nonebefore slicing. None-guards + variant narrowing beforeinteraction.data["values"]access in Discord select handlers.unsupported-operator— None-guards before arithmetic / membership / comparison onOptionalvalues: Discordthread_id+guild.member_count, display spinnerstart_time, gateway_agent_timeout.invalid-method-override— Email platform'ssend_imageacceptsmetadatakwarg to match base class;send_documentaccepts**kwargs(was droppingmetadataentirely →TypeErrorfor callers passingmetadata=...).invalid-type-form— Replaced builtincallable/anyin annotation position withCallable[..., Any]/Anyacross 20 sites inrun_agent.py,cli.py,tools/delegate_tool.py,toolset_distributions.py,hermes_cli/dingtalk_auth.py,tui_gateway/server.py.invalid-assignment—run_conversation.conversation_historyparam widened tolist[dict[str, Any]] | Noneso the 5 None-reset sites type-check.invalid-argument-type—build_anthropic_client(timeout)widened toOptional[float](body already handled None);OMIT_TEMPERATUREsentinel no longer leaks intocall_llm(temperature: float)— conditional-kwargs-dict pattern inrun_agent.pyandscripts/trajectory_compressor.py.deprecated—datetime.utcnow()→datetime.now(timezone.utc)ingateway/platforms/bluebubbles.py(Python 3.12+ deprecation).Dependency hygiene
pyproject.tomlextras:mutagenandaiohttp-socks→[messaging],tiktokenandPillow→[cli],psutil→[mcp],datasets→[rl], new[tts-local]extra forneutts+soundfile.try/except ImportErrorwith silent fallbacks (returning empty dicts, skipping features, estimating values from file sizes). Now they crash loudly if the correct extra isn't installed, instead of silently degrading.Code cleanup
batch_runner.py,mini_swe_runner.py,rl_cli.py,trajectory_compressor.py) toscripts/directory with updated imports and test references.AIOHTTP_AVAILABLEguard pattern fromapi_server.py— aiohttp is now a direct import since it's a required dep of the messaging extra.P1 runtime-bug hotfix batch
Landed as
15ac253b+67bc4410(simplify follow-up). Triaged the remaining ty diagnostics (1953 across 16 rules) into genuine runtime crashes vs false positives vs annotation gaps. Fixed the 15 P1 ship-stoppers that would fireAttributeError/TypeErroron the happy path:CredentialPool.remove_entry,RunStatelog-file fields,_CamofoxConfigTypedDict,_setup_wecom_callback(),tui_gateway/server.pyskills_hub path.re.search),agent/display.py(start_time),gateway/run.py(_agent_timeout).send_image/send_document.build_anthropic_client(timeout: Optional[float]),conversation_history | None,OMIT_TEMPERATUREsentinel guard.datetime.utcnow().Net P1-batch impact: -138 ty diagnostics (1953 → 1815), with wins on
not-subscriptable(-34),unresolved-attribute(-29),invalid-argument-type(-26),invalid-type-form(-20),unsupported-operator(-18),invalid-key(-9).Tests
tests/agent/test_credential_pool.py—CredentialPool.remove_entrycoverage (id match, unknown-id return, priority renumbering).tests/hermes_cli/test_config_shapes.py—_CamofoxConfig/_BrowserConfigTypedDict shape smoke tests.tests/tools/test_rl_training_tool.py—RunStatelog-file field defaults.Test plan
ty check --python-version 3.13passes for all fixed categoriesuv sync --extra allinstalls newly declared depspytest tests/)pytest tests/agent/test_credential_pool.py tests/hermes_cli/test_config_shapes.py 'tests/tools/test_rl_training_tool.py::TestRunStateLogFileFields')